Skip to content

feat(milestone-3.0): PR review loop, multi-repo support, autonomous test generation, smarter triage#5

Merged
OgeonX-Ai merged 169 commits into
mainfrom
phase/1-foundation
Jun 5, 2026
Merged

feat(milestone-3.0): PR review loop, multi-repo support, autonomous test generation, smarter triage#5
OgeonX-Ai merged 169 commits into
mainfrom
phase/1-foundation

Conversation

@OgeonX-Ai

Copy link
Copy Markdown
Contributor

Summary

Milestone 3.0 — gsd-orchestrator Feature Expansion. Extends the orchestrator from a single-repo issue-to-PR automator into a multi-repo, triage-aware, test-generating, PR-reviewing autonomous engineering platform.

  • Phase 12 — Robustness Foundation: Serilog structured JSON logging on all state transitions; Polly v8 ratio-based circuit breaker on MCP tool calls; xUnit + NSubstitute test project with 7 deterministic tests covering all GsdStateMachine paths
  • Phase 13 — Smarter Issue Triage: TriagingState inserted before AnalyzingState — LLM classifies issues as actionable/needs-info/duplicate/out-of-scope with 3-attempt retry; --triage mode exits after classification; duplicate detection via list_issues context
  • Phase 14 — Autonomous Test Generation: TestGeneratingState executes after EditingState — LLM generates xUnit tests for code changes and commits them to the feature branch; ValidatingState verifies compilation
  • Phase 15 — PR Review Loop: ReviewingState dual-mode — --pr <N> fetches diff via MCP, calls Claude LLM (3-attempt JSON-parse retry), posts inline comments with severity (info/warning/error), submits APPROVE or REQUEST_CHANGES; legacy --issue path preserved
  • Phase 16 — Multi-Repo Support: GSD_REPOS JSON array replaces single owner/repo env vars (backwards compatible with GSD_GITHUB_OWNER/GSD_GITHUB_REPO fallback); watch mode loops all repos with per-repo rateLimitDelaySeconds; checkpoints namespaced {owner}_{repo}_{workflowId}.json; IdleState DI cleaned up; path-traversal hardened in FileCheckpointStore

Test plan

  • dotnet test — 35 tests, 0 failures (7 GsdStateMachine + 7 Triaging + 14 TestGenerating + 7 MultiRepoConfig)
  • --triage --issue <N> classifies a real issue and posts a comment
  • --pr <N> reads a real PR diff and posts an inline review
  • GSD_REPOS=[{"owner":"...","repo":"..."}] watch mode processes multiple repos in sequence
  • Single-repo env vars (GSD_GITHUB_OWNER + GSD_GITHUB_REPO) still work without GSD_REPOS

🤖 Generated with Claude Code

OgeonX and others added 30 commits May 21, 2026 15:56
All GitHub API operations verified live: topics PUT, description PATCH,
file create via Contents API. Org pin API confirmed absent (no REST/GraphQL).
Current repo state documented: 3 repos missing topics, LICENSE, descriptions.
.github profile README SHA captured for update.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…le, ci-autopilot exclusion

4 plans in 2 waves covering FOUND-01 through FOUND-05. Wave 1 (plans 01-03) runs
in parallel: topics+descriptions, LICENSE files, org profile README rewrite.
Wave 2 (plan 04) verifies and pins the portfolio repos with a manual checkpoint.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- RESEARCH.md: mark Open Questions as (RESOLVED) with inline RESOLVED markers for both questions; mark verify-phase1.sh Wave 0 gap as resolved via inline verify blocks
- Plan 01-03: replace outer triple-backtick markdown fence around README content with <readme_content> XML tags to prevent nested fence collision with inner mermaid block
- Plan 01-01: replace exact character count expectations (99/98/95) with generic '# Expected: < 100' to avoid incorrect assertions

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
4 plans in 2 waves covering FOUND-01 through FOUND-05:
- Wave 1: topics/descriptions, LICENSE files, org profile README (parallel)
- Wave 2: ci-autopilot exclusion verification + human pin checkpoint

Research confirmed: all GitHub API endpoints tested live, Promptimprover
uses master branch (targeted explicitly), org profile SHA documented.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…, autogen

- gsd-orchestrator: autonomous-agent, github-automation, dotnet, csharp, mcp, model-context-protocol, claude-ai, state-machine, agentic-ai, dotnet10
- Promptimprover: mcp, model-context-protocol, typescript, prompt-engineering, prompt-governance, rag, llm, mcp-server, enterprise-ai, ai-governance
- autogen: multi-agent, python, microsoft-autogen, gemini, claude-ai, agent-framework, agentic-ai, ai-automation, ag-ui, llm
- All three repos verified: 10 topics each, key topics confirmed via API
Plan 01-01: 10 topics + enterprise descriptions on all 3 repos (FOUND-01, FOUND-05)
Plan 01-02: MIT LICENSE on gsd-orchestrator/main, Promptimprover/master, autogen/main (FOUND-03)
Plan 01-03: Org profile README rewritten with Mermaid system diagram (FOUND-02)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
All 5 requirements verified (FOUND-01 through FOUND-05):
- 10 GitHub topics on all 3 repos
- Org profile README rewritten with Mermaid system diagram
- MIT LICENSE on all 3 repos (correct branches)
- ci-autopilot not visible (push position 8+)
- All descriptions <100 chars, enterprise-framed

Pending manual: pin repos via GitHub UI (no API available)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Investigated GitHub Actions .NET 10 setup-dotnet pinning (10.0.1xx
feature band to avoid MSBuild 18 requirement), stateDiagram-v2 and
flowchart LR Mermaid syntax, native GitHub badge URL format, shields.io
static badge format, and dotenv.net build-vs-runtime behavior.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2 plans across 2 waves: Wave 1 creates ci.yml; Wave 2 inserts badges
and Mermaid diagrams into README. Closes GSD-01, GSD-02, GSD-03, GSD-09.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ated on gsd-orchestrator

- ci.yml committed to Coding-Autopilot-System/gsd-orchestrator main (commit 2056d8e)
- GitHub Actions CI run triggered (in_progress)
- All 10 acceptance criteria verified
- CI/build, .NET 10, MIT License badges added to README
- stateDiagram-v2 (9 states, direction LR, no transition labels) added
- flowchart LR component topology (4 subgraphs, 7 edges) added
- All 32 acceptance criteria verified PASS
…rified

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 plans across 2 execution waves:
- 03-00: manual checkpoint to initialize wiki.git via GitHub web UI
- 03-01: clone wiki repo and push all 4 pages (Home, Setup Guide, Architecture, Config Reference)
- 03-02: create GitHub Release v1.0.0 with feature-narrative release notes

Closes planning for GSD-04, GSD-05, GSD-06, GSD-07, GSD-08.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Create 03-VALIDATION.md — Nyquist validation contract for documentation phase
- Mark RESEARCH.md Open Questions as RESOLVED (wiki branch = master, fallback HEAD)
- Update STATE.md — Phase 3 status: ready to execute (3 plans, 3 waves)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…atisfied

- Four wiki pages pushed to Coding-Autopilot-System/gsd-orchestrator.wiki.git (commit d68096c)
- Home.md: hero, CI/.NET10/MIT badges, 5-line quick-start, navigation table
- Setup-Guide.md: standalone guide with .gsd/state/ checkpoint path, successful run output
- Architecture.md: stateDiagram-v2 + flowchart LR (verbatim from README) + Data Flow narrative
- Configuration-Reference.md: all 7 env vars in 3 grouped tables
- Requirements GSD-04, GSD-05, GSD-06, GSD-07 marked complete
- GitHub Release v1.0.0 created at Coding-Autopilot-System/gsd-orchestrator
- Feature-narrative release notes: autonomous, state machine, Polly, .NET 10
- All five Phase 3 requirements closed (GSD-04–08)
- STATE.md: Phase 3 marked complete, progress 100%
- ROADMAP.md: 03-02 checked off
- REQUIREMENTS.md: GSD-08 marked complete
… GSD-08 satisfied

Phase 3 (gsd-orchestrator Wiki & Release) complete:
- 4 wiki pages live: Home, Setup Guide, Architecture, Configuration Reference
- GitHub Release v1.0.0 published with feature-narrative release notes
- VERIFICATION.md: 9/10 automated checks passed, human verified diagrams + release render

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Mark gsd-orchestrator Wiki and v1.0.0 release as validated requirements.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Investigated Promptimprover repo: verified CI strategy (Windows build script
pitfall), wiki not initialized (has_wiki: false), default branch is master,
all source components documented for Architecture wiki page accuracy.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
OgeonX and others added 28 commits June 5, 2026 18:30
…tion to prevent injection and context overflow
…lowModels.cs

- Add RepoConfig sealed record with Owner, Repo, RateLimitDelaySeconds (default 30)
- Add RepoConfigLoader static class with Load(IConfiguration) stub throwing NotImplementedException
- Add using Microsoft.Extensions.Configuration to support IConfiguration parameter type
- MULTI-01 contract scaffolded; Wave 2 will implement Load()
…GREEN test stubs

- Add StatePath(owner, repo, workflowId) overload producing {owner}_{repo}_{workflowId}.json filenames
- Update SaveAsync to use new namespaced overload via ctx.Issue?.RepoOwner/RepoName
- Retain original StatePath(workflowId) for LoadAsync/ArchiveAsync (backwards compat)
- Create MultiRepoConfigTests.cs with 7 [Fact] tests: tests 1,2,4,5 RED (NotImplementedException), tests 3,6 pass ThrowsAny (stub throws), test 7 GREEN (checkpoint naming verified)
- MULTI-01 contract: 6 tests define RepoConfigLoader.Load() behaviour for Wave 2
- MULTI-03 satisfied: SaveAsync now writes {owner}_{repo}_{workflowId}.json namespaced files
- Fixed xUnit1031 warning: Test 7 uses async/await instead of .GetAwaiter().GetResult()
…namespaced checkpoints + 7 RED/GREEN tests

- RepoConfig record + RepoConfigLoader stub (MULTI-01 contract defined)
- FileCheckpointStore.StatePath namespaced to {owner}_{repo}_{workflowId}.json (MULTI-03 satisfied)
- 7 TDD test stubs: 4 RED (NotImplementedException), 3 pass (2 ThrowsAny + 1 GREEN checkpoint test)
- All 28 prior tests remain GREEN
…ion from IdleState

- RepoConfigLoader.Load(): parse GSD_REPOS JSON array; fallback to GSD_GITHUB_OWNER+GSD_GITHUB_REPO; throw if neither present
- Add private RepoConfigDto sealed record for JSON deserialization
- IdleState: remove IConfiguration field and constructor param; read owner/repo from ctx.Issue!.RepoOwner/RepoName
- FileCheckpointStore.StatePath: sanitize owner/repo segments (T-16-05, path traversal mitigation)
- Replace single-repo owner/repo reads with RepoConfigLoader.Load(config) (MULTI-01)
- Watch mode: foreach loop over repos calling RunWatchModeAsync per repo (MULTI-02)
- Single-issue mode: use repos[0] as primary repo (MULTI-01 backwards compat)
- PR mode: use repos[0] as primary repo
- Add rateLimitDelaySeconds parameter to RunWatchModeAsync (MULTI-04)
- Add inter-issue Task.Delay(rateLimitDelaySeconds) inside watch loop (MULTI-04)
- All 35 tests GREEN (7 GsdStateMachineTests + 7 TriagingStateTests + 14 TestGeneratingStateTests + 7 MultiRepoConfigTests)
…, MULTI-01..04 satisfied

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
CR-01: sanitize workflowId in single-arg StatePath and three-arg overload
CR-02: replace character-denylist Sanitize with allowlist regex [^a-zA-Z0-9\-_]
CR-03: LoadAsync and ArchiveAsync scan for *_{workflowId}.json as fallback
WR-03: compute _archiveDir once in constructor via Path.GetFullPath

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Test 3: Assert.Throws<InvalidOperationException> (missing config)
Test 6: Assert.Throws<System.Text.Json.JsonException> (malformed JSON)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Prevents external mutation of the audit trail via .Add()/.Clear() etc.
The Transition() spread operator works on any IEnumerable so no internal change needed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Extracts RepoConfigLoader and RepoConfigDto from WorkflowModels.cs into
RepoConfigLoader.cs in the same namespace, satisfying single-responsibility.
WorkflowModels.cs now contains only data type records/enums.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@OgeonX-Ai OgeonX-Ai merged commit 42d1432 into main Jun 5, 2026
1 check passed

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cf88b3ba0e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +161 to +162
await RunWatchModeAsync(sm, mcpDispatcher, repoConfig.Owner, repoConfig.Repo,
repoConfig.RateLimitDelaySeconds, logger, cts.Token);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Don't block multi-repo watch on the first repo

When GSD_REPOS contains more than one repo, this await never reaches the second element because RunWatchModeAsync contains its own while (!ct.IsCancellationRequested) polling loop and only returns on cancellation. In --watch mode that means only the first configured repository is ever polled, so the new multi-repo support is effectively disabled for every later repo.

Useful? React with 👍 / 👎.

Comment on lines +283 to +285
// get_pull_request returns PR metadata JSON; treat the full JSON payload as the
// diff context for the LLM prompt (ReviewingState uses ctx.PrReview.Diff directly).
diff = diffResult.Text;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Fetch the actual PR diff before invoking the reviewer

For any --pr run, this stores the get_pull_request metadata JSON as PrReviewContext.Diff, and ReviewingState then prompts the LLM to review that value as a git diff. The metadata payload only describes the PR (for example title/body/diff URL), so the reviewer never sees the changed hunks and can approve or request comments without inspecting the code; the PR diff needs to be fetched explicitly before building the review context.

Useful? React with 👍 / 👎.

Comment on lines +9 to +13
Triaging, // Phase 13: issue classification before analysis
Analyzing,
Branching,
Editing,
TestGenerating, // Phase 14: generate xUnit tests for edited source files

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep serialized workflow state values stable

These inserted enum members shift the numeric values of all later states, while checkpoints are serialized with the default System.Text.Json enum handling (no string enum converter is configured in FileCheckpointStore). Any active checkpoint written before this commit with CurrentState of Analyzing or later will deserialize to the wrong state after upgrade, so --resume can re-enter the wrong workflow step or skip steps entirely.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants